Basis Technology Corp. (English Wikipedia)

Basis Technology Corp. is a software company that specializes in applying artificial intelligence techniques to understanding documents written in different languages. It has headquarters in Cambridge, Massachusetts, and offices in San Francisco, Washington, D.C., London, and Tokyo.
The company was founded in 1995 by graduates of the Massachusetts Institute of Technology to use artificial intelligence techniques to help understand the many different languages that humans use. Its software focuses on finding structure inside text so that algorithms can better understand the meaning of the words. The tools identify different forms of names and phrases. A single name, for instance Albert P. Jones, can appear in many different forms. Some texts will call him "Al Jones", others "Mr. Jones", and others "Albert Paul Jons". Basis Technology's software can match all of these instances.
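The kind of matching described above can be illustrated with a minimal sketch. It is not Basis Technology's algorithm, which the text does not describe. The nickname table, the honorific list, the similarity threshold, and the function name `same_person` are all assumptions made for illustration. The sketch compares surnames with a fuzzy string ratio from Python's standard `difflib` module and treats nicknames, initials, and honorifics leniently.

```python
from difflib import SequenceMatcher

# Hypothetical lookup tables, for illustration only.
NICKNAMES = {"al": "albert", "bert": "albert"}
HONORIFICS = {"mr", "mrs", "ms", "dr"}


def tokens(name):
    """Lowercase a name, drop periods, and split it into tokens."""
    return name.lower().replace(".", "").split()


def canonical(given):
    """Map a known nickname to its full form; leave other names unchanged."""
    return NICKNAMES.get(given, given)


def same_person(a, b, threshold=0.8):
    """Return True if two name strings plausibly refer to the same person."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return False

    # The last token is taken to be the surname (an assumption that fails
    # for many naming conventions). Surnames must be similar, not identical,
    # so that a misspelling such as "Jons" still matches "Jones".
    if SequenceMatcher(None, ta[-1], tb[-1]).ratio() < threshold:
        return False

    # Everything before the surname, minus honorifics, is a given name.
    ga = [t for t in ta[:-1] if t not in HONORIFICS]
    gb = [t for t in tb[:-1] if t not in HONORIFICS]
    if not ga or not gb:
        # "Mr. Jones" carries no given name that could contradict a match.
        return True

    fa, fb = canonical(ga[0]), canonical(gb[0])
    if fa == fb:
        return True
    # A bare initial matches any given name starting with the same letter.
    return (len(fa) == 1 or len(fb) == 1) and fa[0] == fb[0]


if __name__ == "__main__":
    for variant in ["Al Jones", "Mr. Jones", "Albert Paul Jons", "Alice Smith"]:
        print(variant, same_person("Albert P. Jones", variant))
```

Under these assumptions the three variants from the text all match "Albert P. Jones", while an unrelated name does not. A real system would need much larger tables and language-specific rules.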
The company's software enhances parsing tools by classifying the role that each word plays and passing that metadata to other algorithms. Software from Basis Technology will, for instance, identify the language of an incoming stream of characters and then identify the parts of each sentence, such as the subject or the direct object.
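Language identification from a character stream is often approached with character n-gram statistics. The following is a toy sketch of that general idea, not the method Basis Technology uses. The two sample sentences, the profile dictionary, and the function names are assumptions chosen for illustration, and a usable identifier would be trained on far more text.

```python
from collections import Counter

# Tiny, hypothetical training samples; real profiles need much more text.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and the cat sat on the mat",
    "german": "der schnelle braune fuchs springt über den faulen hund und die katze sitzt auf der matte",
}


def trigrams(text):
    """Count overlapping three-character sequences in lowercased text."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


# One trigram frequency profile per language.
PROFILES = {lang: trigrams(sample) for lang, sample in SAMPLES.items()}


def identify_language(text):
    """Return the language whose profile shares the most trigrams with text."""
    grams = trigrams(text)

    def overlap(profile):
        # Counter returns 0 for trigrams the profile has never seen.
        return sum(min(count, profile[gram]) for gram, count in grams.items())

    return max(PROFILES, key=lambda lang: overlap(PROFILES[lang]))


if __name__ == "__main__":
    print(identify_language("the dog and the cat"))
    print(identify_language("der hund und die katze"))
```

The overlap score is deliberately simple. Practical systems usually normalize for profile size and also take the character encoding into account.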
The company is best known for its Rosette Linguistics Platform, which uses natural language processing techniques to improve information retrieval, text mining, search engines, and other applications. Major search engines and translators use the tool to create normalized forms of text. Basis Technology software is also used by forensic analysts to search through files for words, tokens, phrases, or numbers that may be important to investigators.
== Rosette Linguistics Platform ==

The Rosette Linguistics Platform consists of a component library for multilingual text retrieval and analysis. Rosette provides automatic language identification, linguistic analysis, entity extraction, and entity translation from unstructured text. It can be integrated into applications to help analyze large volumes of unstructured text.
The Rosette Linguistics Platform is composed of these modules:
* Rosette Language Identifier looks at the structural and statistical signature of a file to identify its language. The pre-configured software can recognize 55 different languages in 45 different encodings.
* Rosette Base Linguistics identifies the lemma or word stem after finding the tokens. Search is often faster and more accurate when words are grouped by their stem.
* Rosette Entity Extractor analyzes raw text and identifies the probable role that words and phrases play in the document, a key step that makes it possible for algorithms to distinguish between the various meanings that many words can have. Splitting the raw text into groups of words according to their role and then classifying their contribution to meaning is often called entity analysis. The Basis hybrid approach mixes statistical modeling with rules, regular expressions, and gazetteers (lists of special words that can be tuned to the language and text to be analyzed). The tool is designed to work directly with varied alphabets and multiple languages, an advantage because foreign words are often transliterated in multiple ways. It is believed to be the first commercially available tool for analyzing Arabic text.
* Rosette Name Translator transliterates non-Latin alphabets like Arabic into a consistent Latin form.
* Rosette Name Indexer builds an index filled with a consistent collection of names in order to simplify searching when there are multiple forms of the same name. (Profile in Boston Business Journal)
* Rosette Core Library for Unicode smooths the use of Unicode text.
* Rosette Chat Translator for Arabic converts words from the Arabic chat alphabet to Arabic.
The Rosette Platform is used both by United States government offices to support translation and by major Internet infrastructure firms such as search engines.
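The rule-based half of the hybrid approach described for the Rosette Entity Extractor can be sketched with a gazetteer lookup and a regular expression. This is an illustrative sketch only: the gazetteer entries, the labels, the year pattern, and the function name `extract_entities` are assumptions, and the statistical modeling component mentioned in the text is left out entirely.

```python
import re

# Hypothetical gazetteer: known phrases mapped to entity labels.
GAZETTEER = {
    "basis technology": "ORGANIZATION",
    "cambridge": "LOCATION",
    "tokyo": "LOCATION",
}

# Hypothetical rule: a four-digit number starting with 19 or 20 is a year.
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")


def extract_entities(text):
    """Return (surface text, label) pairs in order of appearance."""
    found = []

    # Gazetteer pass: case-insensitive whole-phrase matches.
    for phrase, label in GAZETTEER.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text, re.IGNORECASE):
            found.append((match.start(), match.group(), label))

    # Regular-expression pass: pattern-shaped entities such as years.
    for match in YEAR_RE.finditer(text):
        found.append((match.start(), match.group(), "YEAR"))

    # Sort by position so results follow the order of the text.
    return [(surface, label) for _, surface, label in sorted(found)]


if __name__ == "__main__":
    sentence = "Basis Technology was founded in 1995 in Cambridge."
    for surface, label in extract_entities(sentence):
        print(label, surface)
```

Tuning such a system to a new language or document collection mostly means swapping the gazetteer and patterns, which is the flexibility the text attributes to gazetteers.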

Excerpt source: Wikipedia, the free encyclopedia